Position Heaps for Parameterized Strings
Authors
Abstract
We propose a new indexing structure for parameterized strings, called the parameterized position heap. The parameterized position heap supports the parameterized pattern matching problem, in which a pattern matches a substring of the text if there exists a bijective mapping from the symbols of the pattern to the symbols of the substring. We propose an online algorithm for constructing the parameterized position heap of a text and show that it runs in time linear in the text size. We also show that, using the parameterized position heap, all occurrences of a pattern in the text can be found in time linear in the product of the pattern size and the alphabet size.
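The matching condition in the abstract can be illustrated with a minimal sketch. This is a naive quadratic scan written only to make the bijection requirement concrete; it is not the parameterized position heap and does not achieve the bounds stated above. The function names `p_match` and `naive_p_occurrences` are hypothetical, and the sketch assumes every symbol is a parameter, as in the abstract's wording.

```python
def p_match(pattern: str, window: str) -> bool:
    """Return True if some bijection on symbols maps pattern onto window.

    Assumption: every symbol is treated as a renamable parameter.
    """
    if len(pattern) != len(window):
        return False
    forward = {}   # pattern symbol -> window symbol
    backward = {}  # window symbol -> pattern symbol
    for p, w in zip(pattern, window):
        # setdefault records a pairing the first time a symbol is seen and
        # returns the earlier pairing otherwise; both directions must agree
        # for the mapping to be a bijection.
        if forward.setdefault(p, w) != w or backward.setdefault(w, p) != p:
            return False
    return True


def naive_p_occurrences(text: str, pattern: str) -> list[int]:
    """List start positions where pattern parameterized-matches text.

    Naive baseline: checks every window of length len(pattern).
    """
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if p_match(pattern, text[i:i + m])]
```

For instance, under these assumptions the pattern "xyy" matches the window "abb" (x to a, y to b) but not "abc", because y would have to map to both b and c.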
Similar resources
The Position Heap of a Trie
The position heap is a text indexing structure for a single text string, recently proposed by Ehrenfeucht et al. [Position heaps: A simple and dynamic text indexing data structure, Journal of Discrete Algorithms, 9(1):100-121, 2011]. In this paper we introduce the position heap for a set of strings, and propose an efficient algorithm to construct the position heap for a set of strings which is ...
Constructing LZ78 Tries and Position Heaps in Linear Time for Large Alphabets
We present the first worst-case linear-time algorithm to compute the Lempel-Ziv 78 factorization of a given string over an integer alphabet. Our algorithm is based on nearest marked ancestor queries on the suffix tree of the given string. We also show that the same technique can be used to construct the position heap of a set of strings in worst-case linear time, when the set of strings is give...
Parameterized Duplication in Strings: Algorithms and an Application to Software Maintenance
As an aid in software maintenance, it would be useful to be able to track down duplication in large software systems efficiently. Duplication in code is often in the form of sections of code that are the same except for a systematic change of parameters such as identifiers and constants. To model such parameterized duplication in code, this paper introduces the notions of parameterized strings ...
Analysis of String Sorting using Heapsort
In this master's thesis we analyze the complexity of sorting a set of strings. It was shown that the complexity of sorting strings can be naturally expressed in terms of the prefix trie induced by the set of strings. The model of computation takes into account symbol comparisons and not just comparisons between the strings. The analysis of upper and lower bounds for some classical algorithms such...
On the Longest Common Parameterized Subsequence
The well-known problem of the longest common subsequence (LCS), of two strings of lengths n and m respectively, is O(nm)-time solvable and is a classical distance measure for strings. Another well-studied string comparison measure is that of parameterized matching, where two equal-length strings are a parameterized-match if there exists a bijection on the alphabets such that one string matches ...